Dane

Opis danych

Link do Kaggle
Zbiór danych: Cirrhosis Prediction Dataset(cirrhosis.csv)
Nazwa zbioru danych: Cirrhosis Prediction Dataset
Opis zbioru danych: Zbiór danych “Cirrhosis-Prediction” zawiera informacje dotyczące pacjentów z marskością wątroby. Dane te są wykorzystywane do analizy i przewidywania stopnia zaawansowania choroby na podstawie różnych biomarkerów i objawów klinicznych.

Stage

Typ zmiennej: Liczba całkowita (integer)
Opis zmiennej: Stopień zaawansowania marskości wątroby.
Zakres wartości: Zmienna przyjmuje wartości całkowite, gdzie wyższa liczba wskazuje na bardziej zaawansowany etap choroby.

Bilirubin

Typ zmiennej: Liczba rzeczywista (float)
Opis zmiennej: Poziom bilirubiny we krwi, mierzony w miligramach na decylitr (mg/dL).
Zakres wartości: Zmienna przyjmuje wartości ciągłe.

Albumin

Typ zmiennej: Liczba rzeczywista (float)
Opis zmiennej: Poziom albuminy we krwi, mierzony w gramach na decylitr (g/dL).
Zakres wartości: Zmienna przyjmuje wartości ciągłe.

Prothrombin

Typ zmiennej: Liczba rzeczywista (float)
Opis zmiennej: Czas protrombinowy (PT), mierzony w sekundach.
Zakres wartości: Zmienna przyjmuje wartości ciągłe.

Ascites

Typ zmiennej: Liczba całkowita (integer)
Opis zmiennej: Obecność płynu w jamie brzusznej (wodobrzusze).
Zakres wartości: Zmienna przyjmuje wartości N (brak) lub Y (obecność).

Hepatomegaly

Typ zmiennej: Liczba całkowita (integer)
Opis zmiennej: Powiększenie wątroby.
Zakres wartości: Zmienna przyjmuje wartości N (brak) lub Y (obecność).

Spiders

Typ zmiennej: Liczba całkowita (integer)
Opis zmiennej: Obecność pajączków naczyniowych (teleangiektazje).
Zakres wartości: Zmienna przyjmuje wartości N (brak) lub Y (obecność).

Edema

Typ zmiennej: Liczba całkowita (integer)
Opis zmiennej: Obecność obrzęków.
Zakres wartości: Zmienna przyjmuje wartości N (brak), S (umiarkowane) lub Y (znaczne).

Wczytanie danych

setwd('C:/Users/mateu/Desktop/Studia/6 semestr/Wnioskowanie w warunkach niepewności/')
data <- read.csv('cirrhosis.csv')

Wybieramy tylko 300 pierwszych wierszy

data <- head(data, 300)
str(data)
## 'data.frame':    300 obs. of  20 variables:
##  $ ID           : int  1 2 3 4 5 6 7 8 9 10 ...
##  $ N_Days       : int  400 4500 1012 1925 1504 2503 1832 2466 2400 51 ...
##  $ Status       : chr  "D" "C" "D" "D" ...
##  $ Drug         : chr  "D-penicillamine" "D-penicillamine" "D-penicillamine" "D-penicillamine" ...
##  $ Age          : int  21464 20617 25594 19994 13918 24201 20284 19379 15526 25772 ...
##  $ Sex          : chr  "F" "F" "M" "F" ...
##  $ Ascites      : chr  "Y" "N" "N" "N" ...
##  $ Hepatomegaly : chr  "Y" "Y" "N" "Y" ...
##  $ Spiders      : chr  "Y" "Y" "N" "Y" ...
##  $ Edema        : chr  "Y" "N" "S" "S" ...
##  $ Bilirubin    : num  14.5 1.1 1.4 1.8 3.4 0.8 1 0.3 3.2 12.6 ...
##  $ Cholesterol  : int  261 302 176 244 279 248 322 280 562 200 ...
##  $ Albumin      : num  2.6 4.14 3.48 2.54 3.53 3.98 4.09 4 3.08 2.74 ...
##  $ Copper       : int  156 54 210 64 143 50 52 52 79 140 ...
##  $ Alk_Phos     : num  1718 7395 516 6122 671 ...
##  $ SGOT         : num  137.9 113.5 96.1 60.6 113.2 ...
##  $ Tryglicerides: int  172 88 55 92 72 63 213 189 88 143 ...
##  $ Platelets    : int  190 221 151 183 136 NA 204 373 251 302 ...
##  $ Prothrombin  : num  12.2 10.6 12 10.3 10.9 11 9.7 11 11 11.5 ...
##  $ Stage        : int  4 3 4 4 3 3 3 3 2 4 ...

Wybór odpowiednich kolumn

data = data[, c("Stage", "Bilirubin", "Albumin", "Prothrombin", "Ascites", "Hepatomegaly", "Spiders", "Edema")]

Użyte biblioteki

Rgraphviz

Rgraphviz to biblioteka umożliwiająca wizualizację grafów w R, bazująca na oprogramowaniu Graphviz.

gRain

gRain to biblioteka do modelowania probabilistycznego, szczególnie używana do pracy z grafami nieskierowanymi i skierowanymi, takimi jak sieci bayesowskie i markowskie.

bnlearn

bnlearn to biblioteka przeznaczona do uczenia się struktur sieci bayesowskich z danych oraz do przeprowadzania wnioskowania probabilistycznego w takich sieciach.

lattice

lattice to biblioteka do tworzenia zaawansowanych wykresów trellisowych, umożliwiająca wizualizację danych w sposób wielowymiarowy.

library(lattice)
library(Rgraphviz)
## Loading required package: graph
## Loading required package: BiocGenerics
## 
## Attaching package: 'BiocGenerics'
## The following objects are masked from 'package:stats':
## 
##     IQR, mad, sd, var, xtabs
## The following objects are masked from 'package:base':
## 
##     anyDuplicated, aperm, append, as.data.frame, basename, cbind,
##     colnames, dirname, do.call, duplicated, eval, evalq, Filter, Find,
##     get, grep, grepl, intersect, is.unsorted, lapply, Map, mapply,
##     match, mget, order, paste, pmax, pmax.int, pmin, pmin.int,
##     Position, rank, rbind, Reduce, rownames, sapply, setdiff, sort,
##     table, tapply, union, unique, unsplit, which.max, which.min
## Loading required package: grid
library(bnlearn)
library(gRain)
## Loading required package: gRbase
## 
## Attaching package: 'gRbase'
## The following objects are masked from 'package:bnlearn':
## 
##     ancestors, children, nodes, parents
## The following objects are masked from 'package:graph':
## 
##     addEdge, adj, connComp, edges, nodes, removeEdge, subGraph

Pogrupowanie danych

Funkcja do grupowania danych

generate_groups <- function(data, column_name) {
  min_val <- min(data[[column_name]], na.rm = TRUE)
  max_val <- max(data[[column_name]], na.rm = TRUE)
  diff_val <- max_val - min_val
  
  groups <- cut(data[[column_name]], 
                breaks = c(min_val - 1, 
                           min_val + 0.333 * diff_val, 
                           min_val + 0.667 * diff_val, 
                           max_val + 1), 
                labels = c(sprintf("(%.2f, %.2f]", min_val - 1, 
                                   min_val + 0.333 * diff_val), 
                           sprintf("(%.2f, %.2f]", min_val + 0.333 * diff_val, 
                                   min_val + 0.667 * diff_val), 
                           sprintf("(%.2f, %.2f]", min_val + 0.667 * diff_val, 
                                   max_val + 1))) 
  
  return(groups)
}

Pogrupowaie Danych

# Bilirubin
data$Bilirubin <- generate_groups(data, 'Bilirubin')

(data$Bilirubin)
##   [1] (9.52, 18.78]  (-0.70, 9.52]  (-0.70, 9.52]  (-0.70, 9.52]  (-0.70, 9.52] 
##   [6] (-0.70, 9.52]  (-0.70, 9.52]  (-0.70, 9.52]  (-0.70, 9.52]  (9.52, 18.78] 
##  [11] (-0.70, 9.52]  (-0.70, 9.52]  (-0.70, 9.52]  (-0.70, 9.52]  (-0.70, 9.52] 
##  [16] (-0.70, 9.52]  (-0.70, 9.52]  (9.52, 18.78]  (-0.70, 9.52]  (-0.70, 9.52] 
##  [21] (-0.70, 9.52]  (-0.70, 9.52]  (9.52, 18.78]  (-0.70, 9.52]  (-0.70, 9.52] 
##  [26] (-0.70, 9.52]  (18.78, 29.00] (9.52, 18.78]  (-0.70, 9.52]  (-0.70, 9.52] 
##  [31] (-0.70, 9.52]  (-0.70, 9.52]  (-0.70, 9.52]  (-0.70, 9.52]  (-0.70, 9.52] 
##  [36] (-0.70, 9.52]  (-0.70, 9.52]  (-0.70, 9.52]  (-0.70, 9.52]  (-0.70, 9.52] 
##  [41] (-0.70, 9.52]  (-0.70, 9.52]  (-0.70, 9.52]  (-0.70, 9.52]  (-0.70, 9.52] 
##  [46] (-0.70, 9.52]  (-0.70, 9.52]  (-0.70, 9.52]  (-0.70, 9.52]  (-0.70, 9.52] 
##  [51] (-0.70, 9.52]  (-0.70, 9.52]  (-0.70, 9.52]  (-0.70, 9.52]  (-0.70, 9.52] 
##  [56] (-0.70, 9.52]  (-0.70, 9.52]  (-0.70, 9.52]  (-0.70, 9.52]  (-0.70, 9.52] 
##  [61] (-0.70, 9.52]  (-0.70, 9.52]  (18.78, 29.00] (-0.70, 9.52]  (-0.70, 9.52] 
##  [66] (-0.70, 9.52]  (-0.70, 9.52]  (-0.70, 9.52]  (18.78, 29.00] (-0.70, 9.52] 
##  [71] (-0.70, 9.52]  (-0.70, 9.52]  (-0.70, 9.52]  (-0.70, 9.52]  (9.52, 18.78] 
##  [76] (9.52, 18.78]  (-0.70, 9.52]  (-0.70, 9.52]  (-0.70, 9.52]  (-0.70, 9.52] 
##  [81] (9.52, 18.78]  (-0.70, 9.52]  (-0.70, 9.52]  (-0.70, 9.52]  (-0.70, 9.52] 
##  [86] (-0.70, 9.52]  (-0.70, 9.52]  (-0.70, 9.52]  (-0.70, 9.52]  (-0.70, 9.52] 
##  [91] (-0.70, 9.52]  (-0.70, 9.52]  (-0.70, 9.52]  (-0.70, 9.52]  (9.52, 18.78] 
##  [96] (-0.70, 9.52]  (-0.70, 9.52]  (-0.70, 9.52]  (-0.70, 9.52]  (-0.70, 9.52] 
## [101] (-0.70, 9.52]  (-0.70, 9.52]  (-0.70, 9.52]  (-0.70, 9.52]  (-0.70, 9.52] 
## [106] (-0.70, 9.52]  (-0.70, 9.52]  (-0.70, 9.52]  (-0.70, 9.52]  (-0.70, 9.52] 
## [111] (-0.70, 9.52]  (-0.70, 9.52]  (-0.70, 9.52]  (-0.70, 9.52]  (-0.70, 9.52] 
## [116] (-0.70, 9.52]  (-0.70, 9.52]  (-0.70, 9.52]  (-0.70, 9.52]  (-0.70, 9.52] 
## [121] (-0.70, 9.52]  (-0.70, 9.52]  (-0.70, 9.52]  (-0.70, 9.52]  (-0.70, 9.52] 
## [126] (-0.70, 9.52]  (-0.70, 9.52]  (9.52, 18.78]  (-0.70, 9.52]  (9.52, 18.78] 
## [131] (-0.70, 9.52]  (-0.70, 9.52]  (-0.70, 9.52]  (-0.70, 9.52]  (-0.70, 9.52] 
## [136] (-0.70, 9.52]  (-0.70, 9.52]  (-0.70, 9.52]  (-0.70, 9.52]  (-0.70, 9.52] 
## [141] (-0.70, 9.52]  (-0.70, 9.52]  (-0.70, 9.52]  (18.78, 29.00] (-0.70, 9.52] 
## [146] (-0.70, 9.52]  (-0.70, 9.52]  (-0.70, 9.52]  (-0.70, 9.52]  (-0.70, 9.52] 
## [151] (-0.70, 9.52]  (-0.70, 9.52]  (-0.70, 9.52]  (-0.70, 9.52]  (-0.70, 9.52] 
## [156] (18.78, 29.00] (-0.70, 9.52]  (-0.70, 9.52]  (-0.70, 9.52]  (-0.70, 9.52] 
## [161] (-0.70, 9.52]  (-0.70, 9.52]  (-0.70, 9.52]  (-0.70, 9.52]  (-0.70, 9.52] 
## [166] (-0.70, 9.52]  (-0.70, 9.52]  (-0.70, 9.52]  (-0.70, 9.52]  (-0.70, 9.52] 
## [171] (-0.70, 9.52]  (-0.70, 9.52]  (-0.70, 9.52]  (-0.70, 9.52]  (-0.70, 9.52] 
## [176] (-0.70, 9.52]  (-0.70, 9.52]  (-0.70, 9.52]  (-0.70, 9.52]  (-0.70, 9.52] 
## [181] (-0.70, 9.52]  (-0.70, 9.52]  (-0.70, 9.52]  (9.52, 18.78]  (-0.70, 9.52] 
## [186] (-0.70, 9.52]  (9.52, 18.78]  (-0.70, 9.52]  (-0.70, 9.52]  (-0.70, 9.52] 
## [191] (18.78, 29.00] (-0.70, 9.52]  (9.52, 18.78]  (-0.70, 9.52]  (-0.70, 9.52] 
## [196] (-0.70, 9.52]  (-0.70, 9.52]  (-0.70, 9.52]  (-0.70, 9.52]  (-0.70, 9.52] 
## [201] (-0.70, 9.52]  (-0.70, 9.52]  (-0.70, 9.52]  (-0.70, 9.52]  (-0.70, 9.52] 
## [206] (-0.70, 9.52]  (-0.70, 9.52]  (-0.70, 9.52]  (-0.70, 9.52]  (-0.70, 9.52] 
## [211] (-0.70, 9.52]  (-0.70, 9.52]  (-0.70, 9.52]  (-0.70, 9.52]  (-0.70, 9.52] 
## [216] (-0.70, 9.52]  (9.52, 18.78]  (-0.70, 9.52]  (-0.70, 9.52]  (-0.70, 9.52] 
## [221] (-0.70, 9.52]  (-0.70, 9.52]  (9.52, 18.78]  (-0.70, 9.52]  (-0.70, 9.52] 
## [226] (-0.70, 9.52]  (-0.70, 9.52]  (-0.70, 9.52]  (-0.70, 9.52]  (-0.70, 9.52] 
## [231] (-0.70, 9.52]  (-0.70, 9.52]  (-0.70, 9.52]  (-0.70, 9.52]  (9.52, 18.78] 
## [236] (-0.70, 9.52]  (-0.70, 9.52]  (-0.70, 9.52]  (-0.70, 9.52]  (-0.70, 9.52] 
## [241] (-0.70, 9.52]  (-0.70, 9.52]  (-0.70, 9.52]  (-0.70, 9.52]  (-0.70, 9.52] 
## [246] (-0.70, 9.52]  (-0.70, 9.52]  (-0.70, 9.52]  (-0.70, 9.52]  (-0.70, 9.52] 
## [251] (-0.70, 9.52]  (-0.70, 9.52]  (-0.70, 9.52]  (-0.70, 9.52]  (-0.70, 9.52] 
## [256] (-0.70, 9.52]  (-0.70, 9.52]  (-0.70, 9.52]  (-0.70, 9.52]  (-0.70, 9.52] 
## [261] (-0.70, 9.52]  (-0.70, 9.52]  (-0.70, 9.52]  (-0.70, 9.52]  (-0.70, 9.52] 
## [266] (-0.70, 9.52]  (-0.70, 9.52]  (-0.70, 9.52]  (-0.70, 9.52]  (-0.70, 9.52] 
## [271] (-0.70, 9.52]  (-0.70, 9.52]  (-0.70, 9.52]  (-0.70, 9.52]  (-0.70, 9.52] 
## [276] (-0.70, 9.52]  (-0.70, 9.52]  (-0.70, 9.52]  (-0.70, 9.52]  (-0.70, 9.52] 
## [281] (9.52, 18.78]  (-0.70, 9.52]  (-0.70, 9.52]  (-0.70, 9.52]  (-0.70, 9.52] 
## [286] (-0.70, 9.52]  (-0.70, 9.52]  (-0.70, 9.52]  (-0.70, 9.52]  (-0.70, 9.52] 
## [291] (-0.70, 9.52]  (-0.70, 9.52]  (-0.70, 9.52]  (-0.70, 9.52]  (-0.70, 9.52] 
## [296] (-0.70, 9.52]  (-0.70, 9.52]  (-0.70, 9.52]  (-0.70, 9.52]  (-0.70, 9.52] 
## Levels: (-0.70, 9.52] (9.52, 18.78] (18.78, 29.00]
# Albumin
data$Albumin <- generate_groups(data, 'Albumin')

# Prothrombin
data$Prothrombin <- generate_groups(data, 'Prothrombin')

Zmiana typu danych na factor

data[] <- lapply(data, factor)

str(data)
## 'data.frame':    300 obs. of  8 variables:
##  $ Stage       : Factor w/ 4 levels "1","2","3","4": 4 3 4 4 3 3 3 3 2 4 ...
##  $ Bilirubin   : Factor w/ 3 levels "(-0.70, 9.52]",..: 2 1 1 1 1 1 1 1 1 2 ...
##  $ Albumin     : Factor w/ 3 levels "(0.96, 2.85]",..: 1 3 2 1 2 3 3 3 2 1 ...
##  $ Prothrombin : Factor w/ 3 levels "(8.00, 11.70]",..: 2 1 2 1 1 1 1 1 1 1 ...
##  $ Ascites     : Factor w/ 2 levels "N","Y": 2 1 1 1 1 1 1 1 1 2 ...
##  $ Hepatomegaly: Factor w/ 2 levels "N","Y": 2 2 1 2 2 2 2 1 1 1 ...
##  $ Spiders     : Factor w/ 2 levels "N","Y": 2 2 1 2 2 1 1 1 2 2 ...
##  $ Edema       : Factor w/ 3 levels "N","S","Y": 3 1 2 2 1 1 1 1 1 3 ...

Zbadanie niezależności

pairwise_ci_test <- function(df) {
  n <- ncol(df)
  results <- data.frame(
    Comparison = character(),
    p_value = numeric(),
    Independence = character(),
    stringsAsFactors = FALSE
  )
  
  for (i in 1:(n-1)) {
    for (j in (i+1):n) {
      col1 <- df[[i]]
      col2 <- df[[j]]
      
      if (!is.factor(col1)) col1 <- as.factor(col1)
      if (!is.factor(col2)) col2 <- as.factor(col2)
      
      test_result <- ci.test(col1, col2, test = "x2")
      comparison_name <- paste(names(df)[i], "vs", names(df)[j])
      independence <- ifelse(test_result$p.value < 0.05, "zalezne", "niezalezne")
      results <- rbind(results, data.frame(
        Comparison = comparison_name,
        p_value = test_result$p.value,
        Independence = independence
      ))
    }
  }
  
  return(results)
}
results_df <- pairwise_ci_test(data)
results_df
##                     Comparison      p_value Independence
## 1           Stage vs Bilirubin 5.933843e-02   niezalezne
## 2             Stage vs Albumin 7.316962e-06      zalezne
## 3         Stage vs Prothrombin 8.839696e-06      zalezne
## 4             Stage vs Ascites 8.717864e-07      zalezne
## 5        Stage vs Hepatomegaly 4.620397e-15      zalezne
## 6             Stage vs Spiders 7.030103e-06      zalezne
## 7               Stage vs Edema 6.112525e-04      zalezne
## 8         Bilirubin vs Albumin 5.751040e-08      zalezne
## 9     Bilirubin vs Prothrombin 2.201267e-09      zalezne
## 10        Bilirubin vs Ascites 1.295394e-09      zalezne
## 11   Bilirubin vs Hepatomegaly 5.913439e-03      zalezne
## 12        Bilirubin vs Spiders 7.903579e-04      zalezne
## 13          Bilirubin vs Edema 6.346790e-12      zalezne
## 14      Albumin vs Prothrombin 2.643504e-04      zalezne
## 15          Albumin vs Ascites 3.208900e-13      zalezne
## 16     Albumin vs Hepatomegaly 1.121205e-04      zalezne
## 17          Albumin vs Spiders 1.891792e-03      zalezne
## 18            Albumin vs Edema 1.014364e-12      zalezne
## 19      Prothrombin vs Ascites 1.664293e-06      zalezne
## 20 Prothrombin vs Hepatomegaly 6.378526e-03      zalezne
## 21      Prothrombin vs Spiders 1.137241e-04      zalezne
## 22        Prothrombin vs Edema 2.802099e-14      zalezne
## 23     Ascites vs Hepatomegaly 5.480935e-03      zalezne
## 24          Ascites vs Spiders 4.614854e-03      zalezne
## 25            Ascites vs Edema 5.013400e-26      zalezne
## 26     Hepatomegaly vs Spiders 4.843569e-07      zalezne
## 27       Hepatomegaly vs Edema 1.094596e-02      zalezne
## 28            Spiders vs Edema 1.001962e-05      zalezne

Tworzenie sieci

Sieć HC

siec_hc <- hc(data)      
graphviz.plot(siec_hc)

Siec IAMB

siec_iamb <- iamb(data)
siec_iamb <- set.arc(siec_iamb, "Albumin", "Ascites")
siec_iamb <- set.arc(siec_iamb, "Ascites", "Stage")
siec_iamb <- set.arc(siec_iamb, "Stage", "Bilirubin")
siec_iamb <- set.arc(siec_iamb, "Ascites", "Edema")
siec_iamb <- set.arc(siec_iamb, "Stage", "Hepatomegaly")
siec_iamb <- set.arc(siec_iamb, "Hepatomegaly", "Spiders")
siec_iamb <- set.arc(siec_iamb, "Edema", "Prothrombin")
graphviz.plot(siec_iamb)

FAST IAMB

siec_fast_iamb <- fast.iamb(data)
siec_fast_iamb <- set.arc(siec_fast_iamb, "Bilirubin", "Albumin")
siec_fast_iamb <- set.arc(siec_fast_iamb, "Stage", "Ascites")
siec_fast_iamb <- set.arc(siec_fast_iamb, "Stage", "Bilirubin")
siec_fast_iamb <- set.arc(siec_fast_iamb, "Ascites", "Edema")
siec_fast_iamb <- set.arc(siec_fast_iamb, "Stage", "Hepatomegaly")
siec_fast_iamb <- set.arc(siec_fast_iamb, "Hepatomegaly", "Spiders")
siec_fast_iamb <- set.arc(siec_fast_iamb, "Edema", "Prothrombin")
graphviz.plot(siec_fast_iamb) 

Sieć PC STABLE

siec_pc_stable <- pc.stable(data)
siec_pc_stable <- set.arc(siec_pc_stable, "Albumin", "Ascites")
siec_pc_stable <- set.arc(siec_pc_stable, "Stage", "Ascites")
siec_pc_stable <- set.arc(siec_pc_stable, "Stage", "Bilirubin")
siec_pc_stable <- set.arc(siec_pc_stable, "Ascites", "Edema")
siec_pc_stable <- set.arc(siec_pc_stable, "Stage", "Hepatomegaly")
siec_pc_stable <- set.arc(siec_pc_stable, "Hepatomegaly", "Spiders")
siec_pc_stable <- set.arc(siec_pc_stable, "Edema", "Prothrombin")
graphviz.plot(siec_pc_stable)

Sieć GS

siec_gs <- tabu(data)
graphviz.plot(siec_gs)

Wybór sieci

score(siec_hc, data = data, type = "bic")
## [1] -1413.401
score(siec_iamb, data = data, type = "bic")
## [1] -1430.471
score(siec_fast_iamb, data = data, type = "bic")
## [1] -1440.144
score(siec_pc_stable, data = data, type = "bic")
## [1] -1450.361
score(siec_gs, data = data, type = "bic")
## [1] -1413.401

Wybieram siec HC

Wyznaczenie rozkładów prawdopodobieństwa

(bn <- bn.fit(siec_hc, data=data))
## 
##   Bayesian network parameters
## 
##   Parameters of node Stage (multinomial distribution)
## 
## Conditional probability table:
##  
##      Ascites
## Stage          N          Y
##     1 0.05797101 0.00000000
##     2 0.20652174 0.08333333
##     3 0.42391304 0.04166667
##     4 0.31159420 0.87500000
## 
##   Parameters of node Bilirubin (multinomial distribution)
## 
## Conditional probability table:
##  
##                 Ascites
## Bilirubin                 N          Y
##   (-0.70, 9.52]  0.94927536 0.58333333
##   (9.52, 18.78]  0.03623188 0.33333333
##   (18.78, 29.00] 0.01449275 0.08333333
## 
##   Parameters of node Albumin (multinomial distribution)
## 
## Conditional probability table:
##  (0.96, 2.85] (2.85, 3.75] (3.75, 5.64] 
##   0.06333333   0.64000000   0.29666667 
## 
##   Parameters of node Prothrombin (multinomial distribution)
## 
## Conditional probability table:
##  
##                 Edema
## Prothrombin                N           S           Y
##   (8.00, 11.70]  0.916334661 0.655172414 0.300000000
##   (11.70, 14.40] 0.075697211 0.344827586 0.700000000
##   (14.40, 18.10] 0.007968127 0.000000000 0.000000000
## 
##   Parameters of node Ascites (multinomial distribution)
## 
## Conditional probability table:
##  
##        Albumin
## Ascites (0.96, 2.85] (2.85, 3.75] (3.75, 5.64]
##       N   0.47368421   0.93229167   0.98876404
##       Y   0.52631579   0.06770833   0.01123596
## 
##   Parameters of node Hepatomegaly (multinomial distribution)
## 
## Conditional probability table:
##  
##             Stage
## Hepatomegaly         1         2         3         4
##            N 1.0000000 0.7118644 0.5593220 0.1869159
##            Y 0.0000000 0.2881356 0.4406780 0.8130841
## 
##   Parameters of node Spiders (multinomial distribution)
## 
## Conditional probability table:
##  
##        Hepatomegaly
## Spiders         N         Y
##       N 0.8472222 0.5833333
##       Y 0.1527778 0.4166667
## 
##   Parameters of node Edema (multinomial distribution)
## 
## Conditional probability table:
##  
##      Ascites
## Edema          N          Y
##     N 0.88768116 0.25000000
##     S 0.09057971 0.16666667
##     Y 0.02173913 0.58333333
bn_grain <- as.grain(bn)
bn.fit.barchart(bn$Stage)

bn.fit.barchart(bn$Bilirubin)

bn.fit.barchart(bn$Albumin)

bn.fit.barchart(bn$Prothrombin)

bn.fit.barchart(bn$Ascites)

bn.fit.barchart(bn$Hepatomegaly)

bn.fit.barchart(bn$Spiders)

bn.fit.barchart(bn$Edema)

Prawdopodobieństwa

Prawdopodobieństwo wystąpienia Ascites dla pacjentów w wieku 16-20:

TEST_adplvl <- setEvidence(bn_grain, nodes = c('Age'), states = c('16-20'))
TEST_adplvl_wynik <- querygrain(TEST_adplvl, nodes = c('Ascites'), result = 'array')$Ascites
print(TEST_adplvl_wynik)
## Ascites
##    N    Y 
## 0.92 0.08

Bardzo małe prawdopodobieństwo wystąpienia Ascites dla pacjentów w wieku 16-20, wynosi 0.08

Prawdopodobieństwo wystąpienia Spiders dla pacjentów z poziomem prothrombiny poniżej 12:

TEST2_adplvl <- setEvidence(bn_grain, nodes = c('Prothrombin'), states = c('<12'))
TEST2_adplvl_wynik <- querygrain(TEST2_adplvl, nodes = c('Spiders'), result = 'array')$Spiders
print(TEST2_adplvl_wynik)
## Spiders
##    N    Y 
## 0.71 0.29

Małe prawdopodobieństwo wystąpienia Spiders dla pacjentów z poziomem prothrombiny poniżej 12, wynosi 0.29

Prawdopodobieństwo wystąpienia Prothrombin poniżej 12 dla pacjentów z obecnością Edema:

TEST3_adplvl <- setEvidence(bn_grain, nodes = c('Stage'), states = c('4'))
TEST3_adplvl_wynik <- querygrain(TEST3_adplvl, nodes = c('Hepatomegaly'), result = 'array')$Hepatomegaly
print(TEST3_adplvl_wynik)
## Hepatomegaly
##         N         Y 
## 0.1869159 0.8130841

Duże prawdopodobieństwo wystąpienia Prothrombin poniżej 12 dla pacjentów z obecnością Edema, wynosi 0.83